Goto

Collaborating Authors

 string diagram


Bayesian Networks, Markov Networks, Moralisation, Triangulation: a Categorical Perspective

arXiv.org Artificial Intelligence

Moralisation and Triangulation are transformations allowing to switch between different ways of factoring a probability distribution into a graphical model. Moralisation allows to view a Bayesian network (a directed model) as a Markov network (an undirected model), whereas triangulation addresses the opposite direction. We present a categorical framework where these transformations are modelled as functors between a category of Bayesian networks and one of Markov networks. The two kinds of network (the objects of these categories) are themselves represented as functors from a `syntax' domain to a `semantics' codomain. Notably, moralisation and triangulation can be defined inductively on such syntax via functor pre-composition. Moreover, while moralisation is fully syntactic, triangulation relies on semantics. This leads to a discussion of the variable elimination algorithm, reinterpreted here as a functor in its own right, that splits the triangulation procedure in two: one purely syntactic, the other purely semantic. This approach introduces a functorial perspective into the theory of probabilistic graphical models, which highlights the distinctions between syntactic and semantic modifications.


Causal Abstractions, Categorically Unified

arXiv.org Machine Learning

We present a categorical framework for relating causal models that represent the same system at different levels of abstraction. We define a causal abstraction as natural transformations between appropriate Markov functors, which concisely consolidate desirable properties a causal abstraction should exhibit. Our approach unifies and generalizes previously considered causal abstractions, and we obtain categorical proofs and generalizations of existing results on causal abstractions. Using string diagrammatical tools, we can explicitly describe the graphs that serve as consistent abstractions of a low-level graph under interventions. We discuss how methods from mechanistic interpretability, such as circuit analysis and sparse autoencoders, fit within our categorical framework. We also show how applying do-calculus on a high-level graphical abstraction of an acyclic-directed mixed graph (ADMG), when unobserved confounders are present, gives valid results on the low-level graph, thus generalizing an earlier statement by Anand et al. (2023). We argue that our framework is more suitable for modeling causal abstractions compared to existing categorical frameworks. Finally, we discuss how notions such as $τ$-consistency and constructive $τ$-abstractions can be recovered with our framework.


A Diagrammatic Calculus for a Functional Model of Natural Language Semantics

arXiv.org Artificial Intelligence

In this paper, we study a functional programming approach to natural language semantics, allowing us to increase the expressiveness of a more traditional denotation style. We will formalize a category based type and effect system to represent the semantic difference between syntactically equivalent expressions. We then construct a diagrammatic calculus to model parsing and handling of effects, providing a method to efficiently compute the denotations for sentences.


Efficient Generation of Parameterised Quantum Circuits from Large Texts

arXiv.org Artificial Intelligence

Quantum approaches to natural language processing (NLP) are redefining how linguistic information is represented and processed. While traditional hybrid quantum-classical models rely heavily on classical neural networks, recent advancements propose a novel framework, DisCoCirc, capable of directly encoding entire documents as parameterised quantum circuits (PQCs), besides enjoying some additional interpretability and compositionality benefits. Following these ideas, this paper introduces an efficient methodology for converting large-scale texts into quantum circuits using tree-like representations of pregroup diagrams. Exploiting the compositional parallels between language and quantum mechanics, grounded in symmetric monoidal categories, our approach enables faithful and efficient encoding of syntactic and discourse relationships in long and complex texts (up to 6410 words in our experiments) to quantum circuits. The developed system is provided to the community as part of the augmented open-source quantum NLP package lambeq Gen II.


Towards a Categorical Foundation of Deep Learning: A Survey

arXiv.org Artificial Intelligence

The unprecedented pace of machine learning research has lead to incredible advances, but also poses hard challenges. At present, the field lacks strong theoretical underpinnings, and many important achievements stem from ad hoc design choices which are hard to justify in principle and whose effectiveness often goes unexplained. Research debt is increasing and many papers are found not to be reproducible. This thesis is a survey that covers some recent work attempting to study machine learning categorically. Category theory is a branch of abstract mathematics that has found successful applications in many fields, both inside and outside mathematics. Acting as a lingua franca of mathematics and science, category theory might be able to give a unifying structure to the field of machine learning. This could solve some of the aforementioned problems. In this work, we mainly focus on the application of category theory to deep learning. Namely, we discuss the use of categorical optics to model gradient-based learning, the use of categorical algebras and integral transforms to link classical computer science to neural networks, the use of functors to link different layers of abstraction and preserve structure, and, finally, the use of string diagrams to provide detailed representations of neural network architectures.


String Diagrams with Factorized Densities

arXiv.org Artificial Intelligence

Statisticians and machine learners analyze observed data by synthesizing models of those data. These models take a variety of forms, with several of the most widely used being directed graphical models, probabilistic programs, and structural causal models (SCMs). Applications of these frameworks have included cognitive modeling [7, 20], simulation-based inference [9], and model-based planning [12, 21]. Unfortunately, the richer the model class, the weaker the mathematical tools available to reason rigorously about it: SCMs built on linear equations with Gaussian noise admit easy inference, while graphical models have a clear meaning and a wide array of inference algorithms but encode a limited family of models. Probabilistic programs can encode any computably sampleable distribution, but the definition of their densities commonly relies on operational analogies with directed graphical models.


Towards Transparency in Coreference Resolution: A Quantum-Inspired Approach

arXiv.org Artificial Intelligence

Guided by grammatical structure, words compose to form sentences, and guided by discourse structure, sentences compose to form dialogues and documents. The compositional aspect of sentence and discourse units is often overlooked by machine learning algorithms. A recent initiative called Quantum Natural Language Processing (QNLP) learns word meanings as points in a Hilbert space and acts on them via a translation of grammatical structure into Parametrised Quantum Circuits (PQCs). Previous work extended the QNLP translation to discourse structure using points in a closure of Hilbert spaces. In this paper, we evaluate this translation on a Winograd-style pronoun resolution task. We train a Variational Quantum Classifier (VQC) for binary classification and implement an end-to-end pronoun resolution system. The simulations executed on IBMQ software converged with an F1 score of 87.20%. The model outperformed two out of three classical coreference resolution systems and neared state-of-the-art SpanBERT. A mixed quantum-classical model yet improved these results with an F1 score increase of around 6%.


Higher-Order DisCoCat (Peirce-Lambek-Montague semantics)

arXiv.org Artificial Intelligence

DisCoCat [1, 2] (Categorical Compositional Distributional) models are structure-preserving maps which send grammatical types to vector spaces and grammatical structures to linear maps. Concretely, the meaning of words is given by tensors with shapes induced by their grammatical types; the meaning of sentences is given by contracting the tensor networks induced by their grammatical structure. String diagrams provide an intuitive graphical language to visualise and reason formally about the evaluation of DisCoCat models; which can be formalised in terms of functors F: G Vect from the category generated by a formal grammar G to the monoidal category Vect of vector spaces and linear maps with the tensor product [3, 2.5]. Although this functorial definition applies equally to any kind of formal grammar, most of the DisCoCat literature focuses on pregroup grammars and more generally on categorial grammars such as the Lambek calculus [4, 5] and combinatory categorial grammars (CCG) [6]. In that case, G is a closed monoidal category and the DisCoCat models F: G Vect map grammatical structures to the closed structure of Vect in a canonical way. In practice, this means that once the meaning of each word is computed from a dataset, the meaning of any new grammatical sentence can be computed automatically from its grammatical structure.


Categorical Foundations of Explainable AI: A Unifying Theory

arXiv.org Machine Learning

Explainable AI (XAI) aims to address the human need for safe and reliable AI systems. However, numerous surveys emphasize the absence of a sound mathematical formalization of key XAI notions -- remarkably including the term "explanation" which still lacks a precise definition. To bridge this gap, this paper presents the first mathematically rigorous definitions of key XAI notions and processes, using the well-funded formalism of Category theory. We show that our categorical framework allows to: (i) model existing learning schemes and architectures, (ii) formally define the term "explanation", (iii) establish a theoretical basis for XAI taxonomies, and (iv) analyze commonly overlooked aspects of explaining methods. As a consequence, our categorical framework promotes the ethical and secure deployment of AI technologies as it represents a significant step towards a sound theoretical foundation of explainable AI.


Active Inference in String Diagrams: A Categorical Account of Predictive Processing and Free Energy

arXiv.org Artificial Intelligence

We present a categorical formulation of the cognitive frameworks of Predictive Processing and Active Inference, expressed in terms of string diagrams interpreted in a monoidal category with copying and discarding. This includes diagrammatic accounts of generative models, Bayesian updating, perception, planning, active inference, and free energy. In particular we present a diagrammatic derivation of the formula for active inference via free energy minimisation, and establish a compositionality property for free energy, allowing free energy to be applied at all levels of an agent's generative model. Aside from aiming to provide a helpful graphical language for those familiar with active inference, we conversely hope that this article may provide a concise formulation and introduction to the framework.